ABSTRACT

Two simple stationary processes of discrete random
variables with arbitrarily chosen first-order marginal
distributions, DARMA(p,N+1) and NDARMA(p,N), are given. The
correlation structure of these processes mimics that of the
usual linear ARMA(p,q) processes. The relationship of these
processes to mover-stayer models, and to models for discrete
time series given separately by Lindqvist and Pegram, is discussed.
Ad-hoc nonparametric estimators for the parameters in the
DARMA(p,N+1) and NDARMA(p,N) are given. A simulation study
shows them to be as good as maximum likelihood estimators for
the first-order autoregressive case, and to be much simpler to
compute than the maximum likelihood estimators.
1. INTRODUCTION


Discrete time series arise in many different contexts.
For example, the exact arrival times in an arrival process are
usually not measured. Instead the number of arrivals in
successive time intervals is given. This is the case with the
statistics published by the Center for Disease Control on the
incidence of various diseases in the United States. The data
are given as the number of occurrences of each disease in
successive days. If the time intervals are short enough and the
arrival process is orderly, then the resulting time series is
approximately binary. In other instances the process that is
being measured is continuous but the data are quantized in
recording. For example, the amount of rainfall in a day
(24 hours) at a location, given that some occurs, is a continuous
random variable; however, it is often recorded to the
nearest one-hundredth or one-tenth of an inch. Also, since a
rainfall series will often contain many zeros (no rain), an
analysis is often made of successive wet and dry days, which is
a binary time series [cf. Buishand, 1978]. An economic imperative
for modelling and predicting the binary rainfall series is
that it is the primary concomitant variable for predicting the
volume of business done in some establishments on successive days.

Markov chains have been used as models for stationary
discrete time series. However, they are overparametrized for
statistical purposes. Further, the data to be modelled can
often be shown to be non-Markovian, or at least not first-order
Markovian. Higher order Markov chains can be used but this only
aggravates the problem of overparametrization.

In the past several years various parametrically simple
models have been proposed for stationary discrete time series.
The models have as parameters the fixed, first-order marginal
distribution of the time series and the correlation structure.
In Jacobs and Lewis [1978a, 1978b] a simple scheme is given for
obtaining a stationary sequence of discrete random variables with
a given marginal probability mass function $\pi$ and an
autocorrelation structure like that of a mixed first-order
autoregressive, (N+1)st-order moving average process. This DARMA(1,N+1)
process has nonnegative correlations and a possibly countably
infinite state space. The correlation structure is determined by
parameters that are independent of the marginal distribution.

A special case of the DARMA(1,N+1) process with marginal
probability mass function $\pi$ is the DAR(1) process.

This is a Markov chain with discrete state space $\mathbb{E}$ and with
transition matrix

(1.1)  $P = \rho I + (1-\rho) Q$ ,

where $Q$ is a matrix with $Q_{ij} = \pi(j)$ for $i,j \in \mathbb{E}$; $I$ is the
identity matrix with $(i,j)$ element $I_{ij}$, and $0 \le \rho < 1$. The
correlation structure of a real valued DAR(1), i.e. one for which
$\mathbb{E}$ is a subset of the real line, is that of a first-order
autoregressive process with kth-order serial correlation equal to
$\rho^k$. There is no limitation on $\pi$; a common and useful
assumption is that it be Poisson and therefore have an infinite state
space. The DAR(1) model with a finite state space is a special
case of the mover-stayer model [Bhat, 1972, p. 302-9].

Lindqvist [1978] studied a real valued finite state
space Markov chain with a transition function of the form

(1.2)  $P_{ij} = c I_{ij} + (1-c) Q_{ij}$ ,

where $Q_{ij} = \pi(j)$, for $i,j \in \mathbb{E}$ as before. Since the state space
$\mathbb{E}$ is finite, the constant $c$ can take on some negative
values with the constraint that

(1.3)  $\max_{1 \le i \le r} \left[ 1 - (1-\pi(i))^{-1} \right] \le c < 1$ ,

where $r$ is the number of elements in the state space.
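
The transition matrices (1.1) and (1.2) and the bound (1.3) can be checked numerically. The following is a minimal sketch, assuming an illustrative three-point marginal; the function names and the numbers are hypothetical and do not come from the paper. Exact rational arithmetic is used so that the row-sum and stationarity checks are exact.

```python
from fractions import Fraction


def mixture_transition_matrix(pi, c):
    """Return P = c*I + (1-c)*Q with Q[i][j] = pi[j], as in (1.1) and (1.2)."""
    n = len(pi)
    return [[(c if i == j else 0) + (1 - c) * pi[j] for j in range(n)]
            for i in range(n)]


def lindqvist_lower_bound(pi):
    """Smallest admissible c in (1.3): max over i of 1 - 1/(1 - pi(i))."""
    return max(1 - 1 / (1 - p) for p in pi)


def is_stochastic(P):
    """True if every row is a probability vector."""
    return all(sum(row) == 1 and min(row) >= 0 for row in P)


def is_stationary(pi, P):
    """True if pi * P == pi, i.e. pi is a stationary distribution of P."""
    n = len(pi)
    return all(sum(pi[i] * P[i][j] for i in range(n)) == pi[j]
               for j in range(n))


# Illustrative marginal (an assumption, not taken from the paper).
pi = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
c_min = lindqvist_lower_bound(pi)
P_dar = mixture_transition_matrix(pi, Fraction(2, 5))  # DAR(1), rho = 2/5
P_neg = mixture_transition_matrix(pi, c_min)           # boundary case of (1.3)
print(c_min, is_stochastic(P_dar), is_stationary(pi, P_neg))
```

At the lower bound the diagonal entry for the least likely state is exactly zero, which is why (1.3) depends on the marginal distribution.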

In Jacobs and Lewis [1978c], the DAR(1) process was
extended to obtain a sequence of discrete random variables
with pth order Markov dependence and given marginal
distribution. The DAR(p) process is defined as follows. Let $\{V_n\}$
be a sequence of independent identically distributed random
variables with $P\{V_n = 1\} = 1 - P\{V_n = 0\} = \rho$,
$0 \le \rho < 1$; $\{A_n\}$ is a sequence of independent identically
distributed random variables taking values $\{1,2,\ldots,p\}$ with
$P\{A_n = i\} = \alpha_i$, $i = 1,2,\ldots,p$; and $\{Y_n\}$ is a sequence of
independent identically distributed random variables with
discrete state space $\mathbb{E}$ and $P\{Y_n = i\} = \pi(i)$. Let

(1.4)  $Z_n = V_n Z_{n-A_n} + (1-V_n) Y_n$ .

The process $\{Z_n\}$ is called a DAR(p) process. Note that by
direct argument from (1.4)
(1.5)  $P\{Z_{n+1} = j \mid Z_n = i_1, \ldots, Z_{n-p+1} = i_p\} = (1-\rho)\pi(j) + \sum_{k=1}^{p} \rho\alpha_k \varepsilon_j(i_k)$ ,

where $\varepsilon_j(i) = 1$ if $i = j$ and $\varepsilon_j(i) = 0$ otherwise. There
is no limitation on the marginal probability mass function $\pi$.
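
The mixture recursion (1.4) translates directly into a simulation. The sketch below is illustrative only: the function names, the burn-in length, and the parameter values are assumptions, and the starting values are independent draws from $\pi$ rather than a draw from the stationary joint distribution. For p = 2 the Yule-Walker form implied by (1.5) gives a lag-one correlation of $\rho\alpha_1 / (1 - \rho\alpha_2)$, which the simulation should approximate.

```python
import random


def simulate_dar_p(n, values, probs, rho, alpha, seed=0, burn_in=1000):
    """Simulate n values of a DAR(p) process via (1.4).

    values/probs describe the marginal pi; alpha[i-1] = P{A_n = i}.
    The burn-in is a heuristic stand-in for starting in the stationary law.
    """
    rng = random.Random(seed)
    p = len(alpha)
    lags = list(range(1, p + 1))
    z = rng.choices(values, weights=probs, k=p)   # arbitrary starting values
    for _ in range(n + burn_in):
        if rng.random() < rho:                    # V_n = 1: copy a past value
            lag = rng.choices(lags, weights=alpha)[0]   # A_n
            z.append(z[-lag])
        else:                                     # V_n = 0: fresh draw Y_n
            z.append(rng.choices(values, weights=probs)[0])
    return z[-n:]


def lag_correlation(x, k):
    """Sample lag-k serial correlation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


rho, alpha = 0.5, [0.7, 0.3]
z = simulate_dar_p(50000, [0, 1, 2], [0.5, 0.3, 0.2], rho, alpha, seed=1)
theory = rho * alpha[0] / (1 - rho * alpha[1])
print(round(theory, 4), round(lag_correlation(z, 1), 4))
```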

Pegram [1980] considers a real-valued finite state space
model $\{Z_n\}$ which is a generalization of the DAR(p) model in that
its conditional probabilities are of the form

(1.6)  $P\{Z_{n+1} = j \mid Z_n = i_1, \ldots, Z_{n-p+1} = i_p\} = \left[ 1 - \sum_{k=1}^{p} \phi_k \right] \pi(j) + \sum_{k=1}^{p} \phi_k \varepsilon_j(i_k)$ ,

where $\{\phi_k,\ k = 1,\ldots,p\}$ are (possibly) negative constants. Note
that although some of the constants may be negative, the
admissible values for $\{\phi_k\}$ depend on the marginal distribution
$\pi$. It was shown in Jacobs and Lewis [1978c] that
$\mathrm{Corr}(Z_n, Z_{n+k})$, $k = 1,2,\ldots$, for the real valued DAR(p) process
are nonnegative. Pegram's model allows some of the correlations
to be negative. The amount of negative correlation, as in
Lindqvist's model, depends on the marginal distribution $\pi$.

In this paper we will consider models for real-valued
stationary discrete time series whose nonnegative correlation
structure is that of a mixed pth-order autoregressive and qth-order
moving average process. Thus we have a generalization of all
of the preceding models. In Section 2 we will give definitions
of two such models, DARMA(p,N+1) (discrete mixed autoregressive-
moving average process with orders p and N+1 respectively)
and NDARMA(p,N). We briefly describe some of their properties
and suggest an estimate for the correlations. In Section 3
we describe in detail a simulation experiment that was done
to study the behavior of various estimators for the first-order
serial correlation coefficient $\rho$ of the DAR(1) model for
small and moderate sample sizes. In Section 4 some extensions
of the DARMA models are briefly discussed, including one which
can have negative correlations. Throughout the remainder of
the paper we will assume that the NDARMA and DARMA processes
are real valued. They can in fact be used to model categorical
time series, but then numerical measures such as correlations
are meaningless.
2. The DARMA(p,N+1) and NDARMA(p,N) Processes.

In what follows we let $\{Y_n\}$ be a sequence of independent
identically distributed random variables taking values in
a real-valued discrete state space $\mathbb{E}$ with $P\{Y_n = i\} = \pi(i)$,
$i \in \mathbb{E}$. Let $\{U_n\}$ and $\{V_n\}$ be independent sequences of
independent random variables taking the values 0 and 1 with

(2.1)  $P\{U_n = 1\} = \beta$ and $P\{V_n = 1\} = \rho$

for fixed $0 \le \beta \le 1$ and $0 \le \rho < 1$. Let $\{D_n\}$ be a sequence
of independent identically distributed random variables taking
values $0,1,2,\ldots,N$ with $P\{D_k = n\} = \delta_n$, $n = 0,1,\ldots,N$, and
$\{A_n\}$ be a sequence of independent identically distributed
random variables taking values $1,2,\ldots,p$ with $P\{A_k = n\} = \alpha_n$,
$n = 1,2,\ldots,p$.


2.1  The DARMA(p,N+1) process.

The DARMA(p,N+1) process is a sequence of random
variables $\{X_n\}$ which is formed according to the probabilistic
linear model

(2.2)  $X_n = U_n Y_{n-D_n} + (1-U_n) Z_{n-(N+1)}$

for $n = 1,2,\ldots$, where the "autoregressive tail" is

(2.3)  $Z_n = V_n Z_{n-A_n} + (1-V_n) Y_n$
for $n = -N-p+1, -N-p+2, \ldots$. This process differs
from the DARMA(1,N+1) process defined in Jacobs and Lewis
[1978a] in that the "autoregressive tail" is now the pth
order autoregressive process of (1.4). In Jacobs and Lewis
[1978c], it was shown that the vector-valued Markov chain
$\{(Z_n, Z_{n-1}, \ldots, Z_{n-p+1}),\ n = 1,2,\ldots\}$ has a limiting joint
probability mass function $\nu$ with marginal probability mass
function $\pi$. Hence, if the initial $p$-vector of the chain has joint
probability mass function $\nu$, then $\{X_n,\ n = 1,2,\ldots\}$ is a stationary
process with marginal probability mass function $\pi$.

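Let $r(k) = \mathrm{Corr}(X_n, X_{n+k})$ for the stationary process.
Then $\{r(k)\}$ can be shown to satisfy the following system of
equations:

(2.4)  $r(1) = \beta^2 \sum_{i=0}^{N-1} \delta_i \delta_{i+1} + \beta(1-\beta) r_B(1) + (1-\beta)^2 r_A(1)$ ,

(2.5)  $r(2) = \beta^2 \sum_{i=0}^{N-2} \delta_i \delta_{i+2} + \beta(1-\beta) r_B(2) + (1-\beta)^2 r_A(2)$ ,

       ...

(2.6)  $r(N) = \beta^2 \delta_0 \delta_N + \beta(1-\beta) r_B(N) + (1-\beta)^2 r_A(N)$ ,

(2.7)  $r(N+k) = \beta(1-\beta) r_B(N+k) + (1-\beta)^2 r_A(N+k)$ ,  $k \ge 1$.

In these equations

(2.8)  $r_A(k) = \mathrm{Corr}(Z_n, Z_{n+k})$ ,  $k \ge 1$ ,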
which satisfy the following Yule-Walker equations:

(2.9)   $r_A(1) = \rho\alpha_1 r_A(0) + \rho\alpha_2 r_A(1) + \ldots + \rho\alpha_p r_A(p-1)$ ,

(2.10)  $r_A(2) = \rho\alpha_1 r_A(1) + \rho\alpha_2 r_A(0) + \ldots + \rho\alpha_p r_A(p-2)$ ,

        ...

(2.11)  $r_A(p) = \rho\alpha_1 r_A(p-1) + \rho\alpha_2 r_A(p-2) + \ldots + \rho\alpha_p r_A(0)$ ,

and for $k \ge 1$,

(2.12)  $r_A(p+k) = \rho\alpha_1 r_A(p+k-1) + \rho\alpha_2 r_A(p+k-2) + \ldots + \rho\alpha_p r_A(k)$ ,

where $r_A(0) = 1$.

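In addition

$r_B(i) = \mathrm{Corr}(Z_{n+i-(N+1)}, Y_{n-D_n}) = \sum_{j=0}^{N} \mathrm{Corr}(Z_{n+i-(N+1)}, Y_{n-j})\,\delta_j$

is obtained recursively as

$r_B(0) = 0$ ;

$r_B(1) = (1-\rho)\delta_N$ ;

$r_B(2) = \rho\alpha_1 r_B(1) + (1-\rho)\delta_{N-1}$ ;

$r_B(k) = \sum_{i=1}^{k-1} \rho\alpha_i r_B(k-i) + (1-\rho)\delta_{N-(k-1)}$   for $k < \min(p,N)$ ;

$r_B(k) = \sum_{i=1}^{\min(k-1,p)} \rho\alpha_i r_B(k-i) + (1-\rho)\delta_{N-(k-1)}$   for $\max(p,N) > k \ge \min(p,N)$, with $\delta_j = 0$ for $j < 0$ ;

$r_B(k) = \sum_{i=1}^{p} \rho\alpha_i r_B(k-i)$   for $k > \max(p,N)$ .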


To see that the serial correlations for the DARMA(p,N+1)
process are all nonnegative let $q(i)$ (respectively $q_A(i)$) be
the probability that $X_n$ and $X_{n+i}$ (respectively $Z_n$ and
$Z_{n+i}$) choose the same random variable $Y_k$, where, because of
the backward definition of the autoregression, $k \le n$. Then
$q(i)$ (respectively $q_A(i)$) also satisfy equations (2.4) - (2.7)
(respectively (2.9) - (2.12)) and since they are nonnegative,
so are the serial correlations.

To see this identity, let $R_n$ be the random index of
the $Y_k$, $k \le n$, that $X_n$ chooses; that is, $X_n = Y_{R_n}$.

Then, since the $\{Y_n\}$ random variables are independent of the
$\{R_n\}$ random variables,
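(2.13)  $E[X_n X_{n+\ell}] = E[Y_{R_n} Y_{R_{n+\ell}}] = \sum_{k} E[Y_k^2]\, P\{R_n = R_{n+\ell} = k\} + \sum_{k} \sum_{j \ne k} E[Y_k] E[Y_j]\, P\{R_n = k, R_{n+\ell} = j\}$

(2.14)  $= E[Y^2]\, P\{R_n = R_{n+\ell}\} + (E[Y])^2\, P\{R_n \ne R_{n+\ell}\}$ .

Thus

$\mathrm{Cov}(X_n, X_{n+\ell}) = E[Y^2]\, P\{R_n = R_{n+\ell}\} + (E[Y])^2 \left[ P\{R_n \ne R_{n+\ell}\} - 1 \right]$

(2.15)  $= \mathrm{Var}[Y]\, P\{R_n = R_{n+\ell}\}$ .

Therefore

(2.16)  $\mathrm{Corr}(X_n, X_{n+\ell}) = P\{R_n = R_{n+\ell}\} = q(\ell)$

as asserted above.

This identity will also be used in the estimation
procedure proposed in Section 3 for the serial correlations.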
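
Identity (2.16) can be illustrated by simulating a process while recording which $Y_k$ each $X_n$ selects. The sketch below does this for the simplest case, DARMA(1,1) (N = 0, p = 1), where $X_n$ is $Y_n$ with probability $\beta$ and $Z_{n-1}$ otherwise. The function names, the uniform marginal on {0,1,2,3}, and the parameter values are assumptions made for illustration. The reference value follows from conditioning on the two mixing choices: $P\{R_n = R_{n+1}\} = (1-\beta)[\beta(1-\rho) + (1-\beta)\rho]$.

```python
import random


def simulate_darma11(n, rho, beta, draw_y, seed=0):
    """Simulate DARMA(1,1) and return (xs, rs); rs[t] is the index R_t of the
    Y variable that X_t selected."""
    rng = random.Random(seed)
    z_prev, rz_prev = draw_y(rng), 0      # Z_0 = Y_0, so its index is 0
    xs, rs = [], []
    for t in range(1, n + 1):
        y_t = draw_y(rng)
        if rng.random() < beta:           # X_t = Y_t with probability beta
            xs.append(y_t)
            rs.append(t)
        else:                             # X_t = Z_{t-1} otherwise
            xs.append(z_prev)
            rs.append(rz_prev)
        if rng.random() >= rho:           # Z_t = Y_t with probability 1 - rho
            z_prev, rz_prev = y_t, t      # otherwise Z_t = Z_{t-1}
    return xs, rs


def lag_correlation(x, k):
    """Sample lag-k serial correlation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / (n - k)
    return cov / var


def same_choice_fraction(rs, k):
    """Empirical P{R_n = R_{n+k}}."""
    return sum(rs[i] == rs[i + k] for i in range(len(rs) - k)) / (len(rs) - k)


rho, beta = 0.6, 0.3
xs, rs = simulate_darma11(100000, rho, beta, lambda rng: rng.randrange(4), seed=3)
q1_theory = (1 - beta) * (beta * (1 - rho) + (1 - beta) * rho)
print(q1_theory, same_choice_fraction(rs, 1), lag_correlation(xs, 1))
```

The three printed numbers should be close to one another: the first is the exact coincidence probability, the second its empirical frequency, and the third the ordinary sample correlation.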
2.2  The NDARMA(p,N) process


In this subsection we will define another related
discrete time series with the correlation structure of a mixed
moving average autoregressive process. This new process is
more reminiscent of the linear ARMA(p,N) process. The key idea
that leads to this new model is that a probabilistic mixture of
a finite number of random variables each with probability mass
function $\pi$ has probability mass function $\pi$ even if the
random variables are dependent. Thus it is not necessary to
define the autoregression via an autoregressive tail, as in the
DARMA(p,N+1) process; the autoregression can be made explicit,
as in the usual (normal theory) linear processes.
Thus let

(2.17)  $X_n = V_n X_{n-A_n} + (1-V_n) Y_{n-D_n}$ ,

where $\{V_n\}$, $\{A_n\}$, and $\{D_n\}$ are as before. Thus, with
probability $\rho$, $X_n$ is one of the $p$ previous values
$X_{n-1}, \ldots, X_{n-p}$ and with probability $(1-\rho)$ it is a mixture of
the previous $Y_k$, $n-N \le k \le n$. Note that if $\rho = 0$,
then $\{X_n,\ n = 1,2,\ldots\}$ is a DMA(N) process as defined in
Jacobs and Lewis [1978a]. If $P\{D_n = 0\} = 1$, then $\{X_n\}$ is
a DAR(p) process as defined in Jacobs and Lewis [1978c].

Let $\tau = \inf\{i : \delta_i > 0\}$. Note that
^  =  {(Xj^  ,  ,  n  =  1,2,...} 

is a Markov chain with state space $\mathbb{F}$ which is equal to the
product space of $\mathbb{E}$ with itself $p + (N-\tau)$ times. Since


p{x 


n+T+1 


=  y 


n+l 


>  (l-p)6^  Tr(j) 


there  is  a  set  JcilF 


such  that 


-  )sl?^  '  i>  '  V  >  0  . 


where 

K  =  P  +  N  and 

^  ^  ^^n+K  ^  ^n+K-T'  ^n+K-1  ^n+K-l-x'**"  ^n+K-p+1  '^n+K-p+l-x^ 

Thus the condition of case (b) on page 173 of Doob [1953] is
satisfied. The proof on pages 173 and 174, extended to countable
state spaces, shows that the chain has a limiting probability mass
function $\nu$ as $n \to \infty$; further, the convergence of the
conditional distribution of the chain to $\nu$ as $n \to \infty$ is geometric.
The marginal probability mass function of $\nu$ is $\pi$.

It follows from (2.17) that the serial correlations
for the stationary NDARMA(p,N) process satisfy the Yule-Walker
equations for the ARMA(p,N) process with restrictions on the
range of the coefficients;

(2.19)  $r_N(k) = \mathrm{Corr}(X_n, X_{n+k}) = \sum_{i=1}^{p} \rho\alpha_i\, \mathrm{Corr}(X_n, X_{n+k-i}) + (1-\rho) \sum_{i=0}^{N} \delta_i\, \mathrm{Corr}(X_n, Y_{n+k-i})$
for $k > 0$. The correlations $r_{XY}(i) = \mathrm{Corr}(X_n, Y_{n-i})$ can
be computed recursively as follows. For $i = 0$

(2.20)  $r_{XY}(0) = (1-\rho)\delta_0$ ;

for $1 \le i < p$

$r_{XY}(i) = (1-\rho)\delta_i + \rho\alpha_1 r_{XY}(i-1) + \ldots + \rho\alpha_i r_{XY}(0)$ ;

and for $i \ge p$

$r_{XY}(i) = (1-\rho)\delta_i + \sum_{j=1}^{p} \rho\alpha_j r_{XY}(i-j)$ ,

where if $i > N$, then $\delta_i = 0$ by convention.
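
The recursion beginning at (2.20) is straightforward to code. The following is a small sketch under assumed, illustrative parameter values; the function name `r_xy` is hypothetical. By the argument leading to (2.16), each $r_{XY}(i)$ can also be read as the probability that $X_n$ selects $Y_{n-i}$, so the values should sum to one over all lags, which gives a convenient check.

```python
def r_xy(rho, alpha, delta, max_lag):
    """Return [r_XY(0), ..., r_XY(max_lag)] for an NDARMA(p, N) process.

    alpha[j-1] = P{A_n = j} for j = 1..p, delta[i] = P{D_n = i} for i = 0..N.
    """
    p = len(alpha)
    out = []
    for i in range(max_lag + 1):
        d_i = delta[i] if i < len(delta) else 0.0   # delta_i = 0 for i > N
        # Autoregressive part: only lags j <= min(i, p) contribute.
        ar_part = sum(rho * alpha[j - 1] * out[i - j]
                      for j in range(1, min(i, p) + 1))
        out.append((1 - rho) * d_i + ar_part)
    return out


# NDARMA(1,1) with rho = 0.5, delta_0 = 0.4, delta_1 = 0.6 (illustrative).
print(r_xy(0.5, [1.0], [0.4, 0.6], 3))
# NDARMA(2,2): the values over many lags should sum to approximately 1.
print(sum(r_xy(0.5, [0.7, 0.3], [0.2, 0.5, 0.3], 200)))
```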
Hence, if we assume $N < p$,

(2.21)  $r_N(k) = \rho\alpha_1 r_N(k-1) + \rho\alpha_2 r_N(k-2) + \ldots + \rho\alpha_p r_N(p-k) + (1-\rho) \sum_{i=k}^{N} \delta_i\, r_{XY}(i-k)$

for $1 \le k \le N$ ;

(2.22)  $r_N(k) = \rho\alpha_1 r_N(k-1) + \rho\alpha_2 r_N(k-2) + \ldots + \rho\alpha_p r_N(k-p)$

for $k > N$ .

The serial correlations of the NDARMA(p,N) process are
nonnegative since, if $q_N(i)$ is the probability that $X_n$ and
$X_{n+i}$ choose the same random variable $Y_k$, $k \le n$, then
$\{q_N(i)\}$ satisfies equations (2.21) and (2.22). The argument
is the same as for the DARMA(p,N+1) case.

2.3  Comparison of Admissible Range of Correlations for the
DARMA(1,1) and the NDARMA(1,1).

Let $\{X_n\}$ be a stationary DARMA(1,1) process; that is,

(2.23)  $X_n = \begin{cases} Y_n & \text{with probability } \beta, \\ Z_{n-1} & \text{with probability } 1-\beta, \end{cases}$

where

(2.24)  $Z_n = \begin{cases} Z_{n-1} & \text{with probability } \rho, \\ Y_n & \text{with probability } 1-\rho. \end{cases}$

Equations (2.4) - (2.12) for the DARMA(p,N+1) correlations
simplify to

(2.25)  $r(k) = \mathrm{Corr}(X_n, X_{n+k}) = \rho^{k-1} (1-\beta) [\beta(1-\rho) + (1-\beta)\rho]$ .

Similarly let $\{X_n\}$ be a stationary NDARMA(1,1) process;
that is

(2.26)  $X_n = \begin{cases} X_{n-1} & \text{with probability } \rho, \\ Y_n & \text{with probability } (1-\rho)\delta_0, \\ Y_{n-1} & \text{with probability } (1-\rho)(1-\delta_0). \end{cases}$

Equations (2.21) and (2.22) simplify to

(2.27)  $r_N(k) = \mathrm{Corr}(X_n, X_{n+k}) = \rho^{k-1} \left[ \rho + (1-\rho)^2 \delta_0 (1-\delta_0) \right]$ .

Figure 1 gives graphs of the attainable values of
$\{r_N(2), r_N(1)\}$ as the parameter values $\rho$ and $\delta_0$ vary, and
$\{r(2), r(1)\}$ as the parameters $\rho$ and $\beta$ vary. Note that
although the set of attainable correlations for the NDARMA(1,1)
process is not strictly contained in that for the DARMA(1,1)
process, it is much smaller. Thus, the DARMA(1,1) model appears
to be broader than the NDARMA(1,1) model. The smaller region
of possible correlation pairs for the NDARMA(1,1) model seems to
be a constraint due to the explicit autoregression on $\{X_n\}$.
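
The two attainable regions can be tabulated from the closed forms (2.25) and (2.27). The sketch below evaluates both on a parameter grid; the function names and the grid resolution are illustrative assumptions. In both models $r(2) = \rho\, r(1)$. Under (2.27) $r_N(1) \ge \rho$, so every NDARMA(1,1) pair satisfies $r_N(2) \le r_N(1)^2$, whereas the DARMA(1,1) grid contains pairs on both sides of that curve.

```python
def darma11_corr(k, rho, beta):
    """Lag-k correlation of DARMA(1,1), equation (2.25)."""
    return rho ** (k - 1) * (1 - beta) * (beta * (1 - rho) + (1 - beta) * rho)


def ndarma11_corr(k, rho, delta0):
    """Lag-k correlation of NDARMA(1,1), equation (2.27)."""
    return rho ** (k - 1) * (rho + (1 - rho) ** 2 * delta0 * (1 - delta0))


def grid(steps):
    """Equally spaced points 0, 1/steps, ..., 1."""
    return [i / steps for i in range(steps + 1)]


def attainable_pairs(corr, steps=50):
    """(r(1), r(2)) over a grid with 0 <= rho < 1 and second parameter in [0, 1]."""
    return [(corr(1, rho, s), corr(2, rho, s))
            for rho in grid(steps)[:-1] for s in grid(steps)]


nd = attainable_pairs(ndarma11_corr)
da = attainable_pairs(darma11_corr)
# Count grid points lying strictly above the curve r(2) = r(1)^2.
print(sum(r2 > r1 * r1 + 1e-12 for r1, r2 in nd),
      sum(r2 > r1 * r1 + 1e-12 for r1, r2 in da))
```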

2.4  An Estimator for the Serial Correlations of the
DARMA(p,N+1) and NDARMA(p,N) processes.

The usual estimator for the serial correlations of a
stationary real-valued sequence $\{X_1, \ldots, X_m\}$ is

(2.28)  $\tilde r(\ell) = [S^2]^{-1} (m-\ell)^{-1} \sum_{n=1}^{m-\ell} (X_n - \bar X)(X_{n+\ell} - \bar X)$ ,

where

(2.29)  $\bar X = m^{-1} \sum_{j=1}^{m} X_j$

and

(2.30)  $S^2$ is the sample variance of $X_1, \ldots, X_m$.

In this subsection we will suggest another estimator
for the serial correlations of the DARMA and NDARMA processes.

By the remarks at the ends of Sections 2.1 and 2.2,
the $\ell$th serial correlation, $r(\ell)$, for both the stationary
DARMA(p,N+1) and NDARMA(p,N) processes is equal to the
probability that $X_n$ and $X_{n+\ell}$ choose the same $Y_k$, $k \le n$.

Hence, for both processes, for $i \ne j$,
$P\{X_n = i, X_{n+\ell} = j\} = P\{Y_{R_n} = i, Y_{R_{n+\ell}} = j\} = \sum_{k \le n} \sum_{r \le n+\ell,\ r \ne k} P\{R_n = k, R_{n+\ell} = r\}\, P\{Y_k = i, Y_r = j\}$ .

Since the $\{Y_n\}$ random variables are independent, and independent
of $R_n$ and $R_{n+\ell}$, we have

(2.31)  $P\{X_n = i, X_{n+\ell} = j\} = \pi(i)\pi(j)\, P\{R_n \ne R_{n+\ell}\} = \pi(i)\pi(j)[1 - r(\ell)]$ ,

by equation (2.16).
Thus, for $j \in \mathbb{E}$,

(2.32)  $\lim_{N\to\infty} B_N(m,j) = \lim_{N\to\infty} (N-m)^{-1} \sum_{k=1}^{N-m} \sum_{i \ne j} I_i(X_k)\, I_j(X_{k+m}) = \pi(j)[1-\pi(j)][1-r(m)]$

almost surely, where $I_j(x) = 1$ if $x = j$ and 0 otherwise.
Hence

(2.33)  $\hat r(m) = 1 - \sum_{j \in \mathbb{E}} B_N(m,j)[1 - \pi(j)]^{-1}$

is a strongly consistent estimator for the mth serial
correlation for the stationary DARMA and NDARMA processes.
Estimator $\hat r(m)$ is also a strongly consistent estimator for
the finite state space models of Lindqvist [1978] and Pegram
[1980] since the conditional probabilities (1.2) and (1.6)
are of the same form as those for the appropriate DAR(p)
process.

In the next section we pursue this estimator for the
special case of the first-order autoregressive process DAR(1).
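
Estimator (2.33) needs only pair counts at the chosen lag. The sketch below is one possible reading of it; the function name is hypothetical, and the option of substituting empirical frequencies when $\pi$ is unknown is an assumption that anticipates the nonparametric variant studied in Section 3.

```python
from collections import Counter


def adhoc_serial_correlation(x, m, pi=None):
    """Ad hoc estimate of the lag-m serial correlation, following (2.33).

    x  : observed series (list of hashable states)
    m  : lag
    pi : dict state -> marginal probability; if None, empirical frequencies
         of x are used (an assumption for the unknown-marginal case).
    """
    n = len(x)
    if pi is None:
        pi = {state: count / n for state, count in Counter(x).items()}
    # moved_into[j] = number of k with X_{k+m} = j and X_k != j
    moved_into = Counter()
    for k in range(n - m):
        if x[k] != x[k + m]:
            moved_into[x[k + m]] += 1
    total = sum(count / (n - m) / (1 - pi[j])
                for j, count in moved_into.items())
    return 1 - total


print(adhoc_serial_correlation([0, 1, 0, 1, 0, 1], 1))   # alternating series
print(adhoc_serial_correlation([2, 2, 2, 2], 1))         # constant series
```

A strictly alternating two-state series gives the most negative value the statistic can take here, and a constant series gives 1, which illustrates why a truncation to [0,1] may be wanted for short series.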




3. ESTIMATION FOR THE DAR(1) PROCESS

Let $\{X_n\}$ be a stationary DAR(1) process with state
space $\mathbb{E} = \{0,1,\ldots\}$ and first-order serial correlation $\rho$,
$0 \le \rho < 1$; that is,

(3.1)  $X_n = \begin{cases} X_{n-1} & \text{with probability } \rho, \\ Y_n & \text{with probability } 1-\rho \end{cases}$

for $n = 1,2,\ldots$, while $X_0$ is a random variable independent
of $\{Y_n\}$ but with the same probability mass function.

A simulation was conducted to study the performance of
several estimators for $\rho$ for small to moderate series lengths
$m$. The series lengths considered were m = 20, 50, and 200.
The marginal probability mass functions considered were the
Poisson with parameter $\lambda$,

(3.2)  $\pi(k) = e^{-\lambda} \lambda^k / k!$ ,  $k = 0,1,\ldots$ ,

and the geometric with parameter $p$,

(3.3)  $\pi(k) = p^k (1-p)$ ,  $k = 0,1,\ldots$

One type of estimator considered was the single parameter
maximum likelihood estimator. For a series of length $m$
let $N_{ij}$ denote the number of times the DAR(1) process goes
from $i$ to $j$, for $i,j \in \mathbb{E}$, and let $N_j$ denote the total
number of times the process is in state $j$. The log-likelihood
function for a DAR(1) series $\{X_1, \ldots, X_m\}$ of length $m$ is

(3.4)  $L = \sum_{i=0}^{\infty} \sum_{j \ne i} N_{ij} \ln[(1-\rho)\pi(j)] + \sum_{i=0}^{\infty} N_{ii} \ln[1 - (1-\rho)(1-\pi(i))] + \sum_{i=0}^{\infty} I_i(X_1) \ln \pi(i)$ .

Taking the partial derivative of $L$ with respect to $x = 1-\rho$
and setting the derivative equal to zero results, after some
simplification, in the following equation for the maximum
likelihood estimator $\hat x = 1 - \hat\rho$, if it exists:

(3.5)  $f(x) = 1 - N^{-1} \sum_{i=0}^{\infty} N_{ii} \{1 - x[1 - \pi(i)]\}^{-1} = 0$ .

Note that $f(x)$ is monotone decreasing in $x$ and $f(0) \ge 0$.
Hence, if there is a solution to (3.5) in [0,1], it will be
unique.

The ad-hoc estimator of $\rho$ given at (2.33) was also
considered. For the DAR(1) process this estimator is

(3.6)  $\hat\rho = 1 - \sum_{j=0}^{\infty} \left[ N^{-1} \sum_{i=0,\ i \ne j}^{\infty} N_{ij} \right] [1 - \pi(j)]^{-1}$ .
The summations in both the maximum likelihood estimators
and the estimators of the form (3.6) for the Poisson case
(respectively the geometric case) were restricted to be between
$a = \max[1, n_-(\mu - 7\sigma)]$ (respectively $a = \max[1, n_-(\mu - 10\sigma)]$)
and $b = n_+(\mu + 7\sigma)$ (respectively $n_+(\mu + 10\sigma)$); here $n_-(y)$ (respectively
$n_+(y)$) is the largest (respectively smallest) integer less
(respectively greater) than $y$.

Equation (3.5) was solved numerically. In the case
N = 20, it was not uncommon that $f(x)$ did not have a zero in
[0,1]. In this case, if $f(1) > 0$, then $\hat x$ was set equal to
1; that is, $\hat\rho = 0$. If $f(0) = 0$, then $\hat x$ was taken to be
0; that is, $\hat\rho = 1$.
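
Because $f$ in (3.5) is monotone decreasing on [0,1], bisection is one simple way to solve it. The sketch below is not the authors' program. It assumes that $N$ in (3.5) is the number of observed transitions (the value that makes (3.5) the derivative of (3.4)), that $\pi$ is supplied as a dictionary with positive entries for every observed state, and it applies the boundary rules described above. The function name is hypothetical.

```python
from collections import Counter


def dar1_mle_rho(x, pi, tol=1e-10):
    """Single-parameter maximum likelihood estimate of rho from (3.5).

    x  : observed DAR(1) series
    pi : dict state -> marginal probability (known or estimated)
    Assumption: N in (3.5) is taken to be len(x) - 1, the transition count.
    """
    n_trans = len(x) - 1
    # stays[i] = N_ii, the number of transitions from state i to itself
    stays = Counter(a for a, b in zip(x, x[1:]) if a == b)

    def f(xv):
        return 1 - sum(c / (1 - xv * (1 - pi[i]))
                       for i, c in stays.items()) / n_trans

    if f(1.0) > 0:          # no zero in [0, 1]: x_hat = 1, so rho_hat = 0
        return 0.0
    lo, hi = 0.0, 1.0       # invariant: f(lo) >= 0 >= f(hi), f decreasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 1 - (lo + hi) / 2


half = {0: 0.5, 1: 0.5}
print(dar1_mle_rho([0, 0, 0, 1, 1, 1, 0, 0], half))
```

For two equally likely states (3.5) can be solved by hand, giving $\hat\rho = 2\sum_i N_{ii}/N - 1$, which provides a check on the bisection.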

Other estimators for $\rho$ that were considered included
the following:

1.  The usual estimate for first-order serial correlation,

(3.7)  $\hat\rho_1 = [S^2]^{-1} (m-1)^{-1} \sum_{n=1}^{m-1} (X_n - \bar X)(X_{n+1} - \bar X)$ ,

where $\bar X$ and $S^2$ are as in (2.29) and (2.30). If $S^2 = 0$,
then $\hat\rho_1$ was set equal to 1.

2.  The maximum likelihood equation (3.5) was solved
numerically for $\rho$ for each of the following three values for $\pi(j)$:

a.  the known distribution (3.2) or (3.3) with known
parameter $\lambda$ or $p$ was used; $\hat\rho_2$ denotes this estimator;

b.  the known distribution with an estimated parameter
was used; that is,

(3.8)  $\hat\pi(k) = e^{-\hat\lambda} \hat\lambda^k / k!$ ,  where $\hat\lambda = \bar X$ ,

or

(3.9)  $\hat\pi(k) = \hat p^k (1 - \hat p)$ ,  where $\hat p = \bar X [1 + \bar X]^{-1}$ ;

$\hat\rho_3$ denotes this estimator;

c.  the nonparametric estimator $\hat\pi(j) = N^{-1} N_j$ of
$\pi(j)$ was used; $\hat\rho_4$ denotes this estimator.

d.  $\hat\rho_5$ is the estimator of $\rho$ resulting from the
two-dimensional maximum likelihood estimator where the other
parameter is the distribution parameter ($\lambda$ or $p$);

3.  the estimate $\hat\rho_6$ is the nonparametric estimate
(3.6) using $N^{-1} N_j$ as the estimate for $\pi(j)$;

4.  the estimate $\hat\rho_7$ is the nonparametric estimate
(3.6) using the true value of $\pi(j)$;

Both estimators $\hat\rho_6$ and $\hat\rho_7$ can have negative values
for small to moderate sample sizes. Hence, we also considered the
following estimate.

5.  Estimator $\hat\rho_8 = \max(\hat\rho_6, 0)$.


3.2  The  sampling  experiment. 

A DAR(1) series of length $m$ was simulated and the
estimates for $\rho$ were computed. The computation was repeated
for 1000 independent replications and the sample mean, sample
variance, and sample root mean square error were computed. Each
experiment was then repeated for 20 independent replications,
and the mean of the means, mean of the standard deviations, and
mean of the root mean square errors were computed. Tables 1-3
give the means of the root mean square errors for the cases
studied. The box plots of row values appearing in the last
column of the table are given to help the reader to summarize
the performance of the 8 estimators across the 7 cases considered.
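
A much reduced version of such a sampling experiment can be sketched as follows. This is not the original program: it uses Python's built-in generator rather than LLRANDOM, a Poisson marginal sampled by Knuth's multiplication method, far fewer replications, only two of the eight estimators, and a divisor of $m$ in the sample variance. All of these are assumptions made for illustration, as are the function names and parameter values.

```python
import math
import random
from collections import Counter


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_dar1(rng, m, rho, lam):
    """DAR(1) series of length m with Poisson(lam) marginal, as in (3.1)."""
    x = [poisson_draw(rng, lam)]
    for _ in range(m - 1):
        x.append(x[-1] if rng.random() < rho else poisson_draw(rng, lam))
    return x


def usual_estimate(x):
    """Usual lag-one estimate (3.7); returns 1 when the series is constant."""
    m = len(x)
    mean = sum(x) / m
    s2 = sum((v - mean) ** 2 for v in x) / m   # divisor m is an assumption
    if s2 == 0:
        return 1.0
    cross = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(m - 1))
    return cross / ((m - 1) * s2)


def adhoc_truncated(x):
    """Estimate (3.6) with empirical frequencies for pi, truncated at zero."""
    m = len(x)
    freq = Counter(x)
    moved_into = Counter(b for a, b in zip(x, x[1:]) if a != b)
    total = sum(c / (m - 1) / (1 - freq[j] / m) for j, c in moved_into.items())
    return max(1 - total, 0.0)


def rmse_table(m, rho, lam, reps=500, seed=1):
    """Root mean square error of both estimators over reps replications."""
    rng = random.Random(seed)
    sq = {"usual": 0.0, "adhoc": 0.0}
    for _ in range(reps):
        x = simulate_dar1(rng, m, rho, lam)
        sq["usual"] += (usual_estimate(x) - rho) ** 2
        sq["adhoc"] += (adhoc_truncated(x) - rho) ** 2
    return {name: math.sqrt(total / reps) for name, total in sq.items()}


print(rmse_table(20, 0.5, 1.0))
print(rmse_table(200, 0.5, 1.0))
```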

All  runs  were  performed  on  an  IBM  system  360/67  computer 
at  the  Naval  Postgraduate  School  using  the  LLRANDOM  package 
[Learmonth  and  Lewis,  1973]  which  generates  numbers  according 
to  the  scheme  given  by  Lewis,  Goodman,  and  Miller  [1969].  Tests 
of  the  random  number  generator  are  given  in  Learmonth  and  Lewis 
[1974]. 

Among all the estimates of $\rho$, the usual first-order
serial correlation estimator, $\hat\rho_1$, performs least well. The
maximum likelihood estimator with smallest root mean square
error tends to be $\hat\rho_2$, although by the time m = 200 the
difference is minor. Of course the value of $p$ or $\lambda$ would
not be known in general, so that this estimator is unrealistic.
Maximum likelihood estimators $\hat\rho_3$ and $\hat\rho_5$ are about
equivalent, indicating that the extra computational complexity of
the two-parameter maximum likelihood estimator, $\hat\rho_5$, is not
necessary. The performance of the nonparametric estimator $\hat\rho_6$
is about the same as or sometimes better than that of the maximum
likelihood estimator $\hat\rho_4$, especially if, for small sample size
(m = 20), the modification $\hat\rho_8 = \max(\hat\rho_6, 0)$ is used. This
suggests that the ad-hoc estimator of (3.6) altered to give values
in the range [0,1] is almost as good as the maximum likelihood
estimator with the same estimate for $\pi(j)$. The ad-hoc
estimator is much easier to compute than the maximum likelihood
estimator.


[Tables 1-3: Estimated Root Mean Square Error for several
competing estimators of $\rho$ for DAR(1) series of length $m$.
Surviving row, ad hoc estimator $\max(\hat\rho_6, 0)$:
.068  .101  .088  .072  .121  .092  .131]


4.  EXTENSIONS 


Since the process $\{X_n\}$ is obtained as a probabilistic
mixture of the $\{Y_n\}$, the DARMA and NDARMA processes may be defined
using any sequence of independent identically distributed random
variables $\{Y_n\}$. One implication is that DARMA and NDARMA
processes may have a continuous marginal distribution. However,
even if the distribution of $Y_n$ is continuous, a realization
of the sequence $\{X_n\}$ will, in general, contain many runs of a
single value. This seems to be the major drawback to using
DARMA and NDARMA processes to obtain a sequence of dependent
random variables with a specified continuous distribution and
correlation structure. However, the process with continuous
marginals may be useful in simulation studies.

Multivariate DARMA and NDARMA processes may be obtained
by using a sequence of multivariate $\{Y_n\}$. To illustrate this
we generate DARMA and NDARMA-like processes having negative
correlations. These can be derived from bivariate processes
as follows.

Let $\{(Y_n(1), Y_n(-1))\}$ be a sequence of independent
bivariate random variables with state space $\mathbb{E} = \{0, \pm 1, \ldots\}$,
marginal probability mass function $\pi$, and correlation
$r = \mathrm{Corr}(Y_n(1), Y_n(-1))$ which will be negative in general. One
way to generate such a sequence is to note that a random
variable $Y_n(1)$ with probability mass function $\pi$ can be simulated
from a uniform [0,1] random variable $U$ by defining

(4.1)  $Y_n(1) = j$  if  $\sum_{i=-\infty}^{j-1} \pi(i) < U \le \sum_{i=-\infty}^{j} \pi(i)$ .

If $Y_n(-1)$ is generated by

(4.2)  $Y_n(-1) = j$  if  $\sum_{i=-\infty}^{j-1} \pi(i) < 1 - U \le \sum_{i=-\infty}^{j} \pi(i)$ ,

then $(Y_n(1), Y_n(-1))$ is called an antithetic pair. If $\pi$ is
symmetric about zero, then

$r = \mathrm{Corr}(Y_n(1), Y_n(-1)) = -1$ .
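
The antithetic construction (4.1) - (4.2) amounts to inverting the distribution function at $U$ and at $1 - U$. The sketch below assumes a finite support listed in increasing order; the function names and the example marginals are illustrative. With a marginal symmetric about zero the two members of each pair are negatives of one another, apart from boundary values of $U$ that occur with negligible probability.

```python
import random
from itertools import accumulate


def inverse_transform(u, support, probs):
    """Return the j with F(j-1) < u <= F(j), as in (4.1)."""
    for value, cum in zip(support, accumulate(probs)):
        if u <= cum:
            return value
    return support[-1]     # guard against round-off in the final cumulative sum


def antithetic_pair(rng, support, probs):
    """Generate (Y(1), Y(-1)) from a single uniform U via (4.1) and (4.2)."""
    u = rng.random()
    return (inverse_transform(u, support, probs),
            inverse_transform(1 - u, support, probs))


rng = random.Random(0)
symmetric = [antithetic_pair(rng, [-1, 0, 1], [0.25, 0.5, 0.25])
             for _ in range(10)]
print(symmetric)
```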

A bivariate DARMA(p,N+1) process $\{(X_n(1), X_n(-1))\}$
is defined as follows. Let $\{a_0, \ldots, a_p\}$ and $\{b_0, \ldots, b_N\}$ be
fixed sequences of numbers that are either -1 or 1. Let

(4.3)  $X_n(1) = U_n Y_{n-D_n}(b_{D_n}) + (1-U_n) Z_{n-(N+1)}(a_0)$

(4.4)  $X_n(-1) = U_n Y_{n-D_n}(-b_{D_n}) + (1-U_n) Z_{n-(N+1)}(-a_0)$

for $n = 1,2,\ldots$,
where

(4.5)  $Z_n(a_0) = V_n Z_{n-A_n}(a_{A_n}) + (1-V_n) Y_n(a_0)$

(4.6)  $Z_n(-a_0) = V_n Z_{n-A_n}(-a_{A_n}) + (1-V_n) Y_n(-a_0)$

for $n = -N, -N+1, \ldots$, where $\{A_n\}$ and $\{D_n\}$ are as in Section
2. The random variable $X_n(-1)$ is called the dual of $X_n(1)$,
or the antithetic when (4.1) and (4.2) hold for the $\{Y_n\}$ pair.
Note that if (4.1) and (4.2) hold and $\pi$ is symmetric about
zero, then $Z_n(-1) = -Z_n(1)$ and $X_n(-1) = -X_n(1)$.

A bivariate NDARMA(p,q) process $\{(X'_n(1), X'_n(-1))\}$ can
be defined similarly:

(4.7)  $X'_n(1) = V_n X'_{n-A_n}(a_{A_n}) + (1-V_n) Y_{n-D_n}(b_{D_n})$

(4.8)  $X'_n(-1) = V_n X'_{n-A_n}(-a_{A_n}) + (1-V_n) Y_{n-D_n}(-b_{D_n})$

The stationary bivariate DARMA and NDARMA processes
will have marginal probability mass function $\pi$. A process
having possibly negative correlations can be obtained by considering
the marginal processes $\{X_n(1)\}$, $\{X_n(-1)\}$, $\{X'_n(1)\}$, $\{X'_n(-1)\}$.
Details will be given elsewhere.